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Final Report 

Operator Method Digital Optical Computers 

1.0 Introduction: 

Discussion of ultimate performance limits of digital optical computers

Digital optical computer design has been focused primarily on "parallel" implementation, as shown in Figure 1.
As shown, these typical machines have two planes of inputs and one output plane. Input planes for the "A" and "B"
inputs have been implemented with various forms of spatial light modulators. Multichannel acousto-optic devices are
used primarily because of their performance and availability. An example can be found in reference 2.

We refer to parallel in the strict sense of single point-to-point interconnection, as shown in the figure. This type of
architecture is the simplest to implement in hardware because lenses can simply image points on an input plane
to points on a second image, and again simply image this binary product to an output detection plane.

In terms of expected performance, Figure 2 compares this type of architecture to currently developing VHSIC
systems. Using demonstrated multichannel acousto-optic devices, a figure of merit can be formulated. Here we focus
on a figure of merit named "Gate Interconnect Bandwidth Product" or GIBP. This is equivalently the number of
two-input gates connected together times their utilization per second. As can be seen in Figure 2, for the multichannel
acousto-optic device, the number of effective gates is calculated to be 16,384, or simply the total interconnect of two 32 x 512





UAH SUB 89-116 









channel acousto-optic devices, where 32 is the number of channels and 512 is the time-bandwidth product, or number
of resolution elements, in each channel. Since these devices can be clocked at 100 MHz (or 10 ns effective gate time),
the total GIBP is calculated as 1.6 x 10^12. We feel this represents a true measure of speed. VHSIC chips today may
exhibit in excess of 10^5 gates/chip with clock speeds approaching 10 ns (10^8 Hz). Thus one can achieve VHSIC
performance of 10^13 GIBP. Once again, algorithmic efficiency affects the total performance, but from the simple GIBP
comparison one can see that parallel optical implementations of digital computers barely, if at all, compete with
semiconductor VHSIC devices with respect to GIBP.
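
The GIBP arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the figure of merit as defined in the text (number of two-input gates times clock rate); the function and variable names are illustrative and do not come from the report.

```python
def gibp(gates: int, clock_hz: float) -> float:
    """Gate Interconnect Bandwidth Product: gates times utilization per second."""
    return gates * clock_hz


# Parallel (point-to-point) optical case: 32 channels, 512 resolution
# elements per channel, clocked at 100 MHz.
channels, elements_per_channel = 32, 512
parallel_gates = channels * elements_per_channel      # 16,384 gates
optical_parallel_gibp = gibp(parallel_gates, 100e6)   # about 1.6 x 10^12

# VHSIC chip as quoted in the text: 10^5 gates at 10^8 Hz.
vhsic_gibp = gibp(10**5, 1e8)                         # 10^13

print(f"parallel optical: {optical_parallel_gibp:.2e} GIBP")
print(f"VHSIC chip:       {vhsic_gibp:.2e} GIBP")
print(f"VHSIC advantage:  {vhsic_gibp / optical_parallel_gibp:.1f}x")
```

Under these numbers a single VHSIC chip already leads the parallel optical configuration by roughly a factor of six, which is the comparison the paragraph above draws.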


                           Parallel Optical              VHSIC

32x512 A.O. devices (2)
  Gates                    16,384 gates {128^2}          10^5 gates
  Clock                    10^8 Hz (10 ns clock)         10^8 Hz (10 ns clock)
  GIBP*                    1.6 x 10^12                   10^13

1000x1000 SLM
  Gates                    10^6 gates                    10^6 gates
  Clock                    10^9 Hz (1 ns clock)          10^8 Hz (10 ns clock)
  GIBP*                    10^15                         10^14

*Gate Interconnect Bandwidth Product

Figure 2: Performance comparison illustration of Parallel Optical implementation vs. VHSIC.



Conventional thinking in the optical computing community has been to improve the input spatial light modulators
(SLMs). A great deal of work in this area includes the work at 1.) U. Colorado in the area of ferroelectric liquid
crystals, 2.) AT&T Bell Labs in the area of quantum well SLMs, 3.) Texas Instruments in the area of membrane
light modulators, and many others. In all cases, the objective is to produce devices which will ultimately allow
1000x1000 pixel performance. Even if this is someday accomplished at equivalent or greater clock speeds, very
optimistically 1 ns, the ultimate limit of parallel optical digital computing systems can only reach a GIBP of
10^15 per computer. 100 to 1000 VHSIC chips are required today to achieve the same computational complexity.

It is therefore our opinion that conventional parallel optical digital computer architecture demonstrates only
marginal competitiveness at best when compared to projected semiconductor implementations.

2.0 The Opportunity of Global Interconnects

Optical systems, however, are not limited to "parallel" interconnects. As shown in Figure 3, every point in the first
input plane can be connected to every point in the second input plane, which can subsequently be connected to every
point in the output plane.


[Figure 3]


This type of configuration is referred to as a "full global" interconnect.

Clearly several advantages can be seen. Global optical interconnects can cross optical paths and no crosstalk will
be observed. This type of interconnect is clearly extremely difficult with semiconductor technology due to inductive and
capacitive crosstalk problems, especially at high clock rates. Another advantage is the ability to achieve extremely high
fan-in on the detectors. There are no capacitive loading effects as seen in semiconductor technology. Extremely large
fan-ins are projected for optics (>1000:1), whereas in semiconductor technology greater than 10 is difficult.
Consequently, global optical technology appears to be well suited for "wide word" processing. Thus the tradeoff leans
towards larger multi-input gates and fewer gate delays.

The largest advantage of global interconnects is the large improvement potential in gate interconnect bandwidth, as
can be seen in Figure 4. Even with today's available and mature spatial light modulators like the one described earlier,
i.e. a 32-channel acousto-optic device with a time-bandwidth product per channel of 512, at a 10 ns clock rate the
resultant GIBP that can be achieved will approach 2.7 x 10^16! This, when compared to current VHSIC technology,
represents over 3 orders of magnitude improvement over a dense VHSIC chip configured at 100 MHz. Another way of
expressing this improvement is to consider the optical system to be equivalent to 2700 VHSIC chips.

                           Global Optics                      VHSIC

32x512 A.O. devices (2)
  Gates                    268,435,456 gates {128^2}          10^5 gates
  Clock                    10^8 Hz (10 ns clock)              10^8 Hz (10 ns clock)
  GIBP*                    2.7 x 10^16                        10^13

1000x1000 SLM (2)
  Gates                    10^12 gates {1000^2}               10^6 gates
  Clock                    10^9 Hz (1 ns clock)               10^8 Hz (10 ns clock)
  GIBP*                    10^21                              10^14

*Gate Interconnect Bandwidth Product

Figure 4: Performance comparison illustration of "full global" interconnect implementation vs. VHSIC.

If 1000x1000 element spatial light modulators are indeed ever developed that operate at 1 ns, the GIBP potential
of digital optical computers could ultimately approach 10^21, or 7 orders of magnitude improvement potential.
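
The global-interconnect figures can be checked the same way. The sketch below assumes, as the text and Figure 4 indicate, that with full global interconnect every element of one input plane pairs with every element of the other, so the gate count is the square of the per-plane element count. All names are illustrative.

```python
def global_gates(elements_per_plane: int) -> int:
    """Every point of plane A paired with every point of plane B."""
    return elements_per_plane ** 2


# Today's device: 32 channels x 512 resolution elements, 10 ns clock.
ao_gates = global_gates(32 * 512)            # 268,435,456 gates
ao_gibp = ao_gates * 1e8                     # about 2.7 x 10^16

# Equivalent number of VHSIC chips (10^5 gates at 10^8 Hz each).
vhsic_chip_gibp = 1e5 * 1e8
equivalent_chips = ao_gibp / vhsic_chip_gibp  # about 2,700

# Projected 1000x1000 SLM at a 1 ns clock versus a 10^6-gate VHSIC chip.
slm_gibp = global_gates(1000 * 1000) * 1e9   # 10^21
projected_vhsic_gibp = 1e6 * 1e8             # 10^14

print(f"{ao_gates:,} gates, {ao_gibp:.2e} GIBP")
print(f"equivalent VHSIC chips: {equivalent_chips:.0f}")
print(f"projected ratio: {slm_gibp / projected_vhsic_gibp:.0e}")
```

The computed chip equivalence (about 2,684) rounds to the 2700 quoted above, and the projected ratio reproduces the 7 orders of magnitude.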


3.0 Thermal and other limits 

Although the utilization of global interconnects clearly shows great potential in terms of projected
throughput/compute capability, optical computing systems offer in addition the potential of extremely low power
dissipation as compared to semiconductor technology.

By using current optical technology, i.e. acousto-optic devices and avalanche photodiode arrays, photon budgets per
event can approach theoretical limits. For example, a 1000 photon threshold represents 6x10^4 kT at 300 K, thereby
approaching within a factor of 60 kT per photon per event. Current semiconductor technology requires at best 2 orders
of magnitude, on average 4 orders of magnitude, and at most [illegible] orders of magnitude more power per bit, as can
be seen in Figure 5, compiled from references 13-17.

Consider 1,000 photons per event. To achieve the "theoretical" limit GIBP of 10^21, significant optical power is
required. Specifically, 10^21 GIBP multiplied by 1,000 photons per event yields 10^24 photons per second. A 1 watt,
0.81 µm source will deliver 4.075x10^18 photons per second. Therefore, to achieve 10^24 photons per second, without
consideration for losses in the system such as diffraction efficiency of the acousto-optic devices, detector
responsivity and various other losses, requires a total power budget of 10^24 ÷ 4.075x10^18 = 245,398 watts of power!
The conclusion here is that we do not feel that we will ever be able to fully exploit all the power in optical
interconnects, i.e. ever obtain a GIBP of 10^21!
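
The photon-budget numbers in this section follow from the photon energy E = hc/λ. The sketch below recomputes them with current CODATA constants, which are an assumption here: the report does not state which constant values it used, so the results agree with its figures only to about three significant digits.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J*s
LIGHT_SPEED_M_S = 2.99792458e8    # speed of light, m/s
BOLTZMANN_J_K = 1.380649e-23      # Boltzmann constant, J/K


def photon_energy_j(wavelength_m: float) -> float:
    """Energy of one photon at the given wavelength."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m


e_photon = photon_energy_j(0.81e-6)            # 0.81 micron source
kt_300 = BOLTZMANN_J_K * 300.0                 # thermal energy at 300 K

kt_per_photon = e_photon / kt_300              # roughly 60 kT per photon
kt_per_event = 1000 * kt_per_photon            # roughly 6 x 10^4 kT per event
photons_per_watt_second = 1.0 / e_photon       # roughly 4.08 x 10^18

# 10^21 GIBP at 1,000 photons per event, ignoring all system losses.
required_photons_per_s = 1e21 * 1000
required_power_w = required_photons_per_s / photons_per_watt_second

print(f"{kt_per_photon:.1f} kT per photon")
print(f"{photons_per_watt_second:.3e} photons per second per watt")
print(f"{required_power_w:,.0f} W for 10^21 GIBP")
```

The result is roughly 245 kW, in line with the 245,398 W quoted above.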

So what is a reasonable performance projection? The data from Figure 5 can be plotted as shown in Figure 6, titled
GIBP versus power. Clearly the most competitive technology is that of GaAs. The GaAs technology boundary as shown
in the figure allows GaAs to have the maximum allowable leverage. The line is drawn with the assumption that standard
gate propagation delays of 100 ps can be used as the clock value, an assumption requiring a 40 GHz bandwidth @ RTZ
format. As shown on the graph, it may be possible to achieve approximately 10^[illegible] GIBP at a power consumption
of 2 to 3 kW.

Notice that the optical device curve at 100% efficiency is at least 3 orders of magnitude better. Our current
prototype, the DOC-1 (digital optical computer), is designed to operate at a GIBP of approximately 10^11 and is shown
accordingly on the graph. For the moment, ignore the Bragg cell power consumption (approximately 32 watts) and the
detector transimpedance amplifier / threshold circuitry (another 64 watts). Looking only at the photon budget
requirement using typical TeO2 diffraction efficiencies (here a multiplicative efficiency of .32% is assumed), the power
consumption of 50 mW is already superior to GaAs technology. In addition, the substitution of GaP Bragg cells, which
decrease the inefficiency to 12%, shows an optical power consumption on the order of 1 mW.

Unfortunately, the technology is not there for the drive and detector electronics. So let us go back to the above
question of what is a reasonable projection. It appears from the graph that for a digital optical computer to be clearly
competitive it must have at a minimum the following specifications: GIBP > 10^[illegible], gate efficiency > 1%, and
drive/detector power consumption of less than 100 W.



Figure 6: GIBP vs. Power consumption. 




4.0 Analog Global: 

The use of global interconnects in analog optical computing is not new to the field. For example, as early as 1964,
A.B. Vander Lugt invented the optical correlator as shown in Figure 7 [ref. 18].

The second lens produces the Fourier transform of the input at the matched filter plane. The operation of Fourier
transformation is in and of itself a global interconnect operation in two dimensions. For example, if the input is a point
source, the distribution in the Fourier plane is a plane wave. Thus the system globally broadcasts the light from the point
source to all points in the Fourier plane. Consequently, if the input is considered as an array of point sources, one can
clearly see how this system performs a full global interconnect.

After multiplication with the matched filter, the next lens again produces a Fourier transform. This time the
Fourier transform of the product of the Fourier transform of the input times the matched filter is produced at the output.
This is commonly referred to as the correlation function. Clearly, this system can never be beaten with digital electronics
because full global interconnects are used. The question is how to utilize this "correlator" type architecture in the digital
regime efficiently.
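
The "global broadcast" property of the Fourier transform described in this section can be illustrated numerically. The sketch below uses a one-dimensional discrete Fourier transform as a stand-in for the lens, which is an assumption made for illustration and not a model of the optical system: a single point source in the input produces equal magnitude at every point of the transform plane.

```python
import cmath


def dft(samples):
    """Direct discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
        for m in range(n)
    ]


# A single "point source" at position 3 of an 8-point input plane.
n_points = 8
point_source = [0.0] * n_points
point_source[3] = 1.0

spectrum = dft(point_source)
magnitudes = [abs(value) for value in spectrum]

# Every output point receives the same magnitude; only the phase differs,
# which is the discrete analogue of a tilted plane wave.
print([round(m, 6) for m in magnitudes])
```

Because each input point reaches every output point, an array of such points gives the all-to-all connectivity that the text calls a full global interconnect.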



Figure 7: The analog Vander Lugt optical correlator utilizes full global interconnects. 

5.0 Quasi-digital: 

Figure 8 shows a planar global interconnect between two linear spatial light modulators and the output plane.
If two digital words are placed respectively at the two input planes, an interesting phenomenon occurs.

For example, in Figure 8 three-bit words A(a_1, a_2, a_3) and B(b_1, b_2, b_3) are placed at the two input
planes as shown. Notice that, with full global interconnects, five equations are produced as follows:

    a_3 b_3                         = c_5
    a_2 b_3 + a_3 b_2               = c_4
    a_1 b_3 + a_2 b_2 + a_3 b_1     = c_3
    a_1 b_2 + a_2 b_1               = c_2
    a_1 b_1                         = c_1

Figure 8: Flash Digital Multiplication by Analog Convolution (DMAC) by utilizing full global interconnects.

Notice that these five equations produce exactly the same answer as the DMAC algorithm (digital multiplication
by analog convolution) as shown in Figure 9. We do not propose to pursue this path. However, what is important is that
a "full global" interconnect produces the convolution of the two vector inputs. And it produces this full convolution in
one clock cycle.
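
The equivalence between the five convolution sums and ordinary multiplication can be checked in software. The sketch below is an illustration only: the helper names are hypothetical, and bit a_1 is taken as the most significant bit, matching the column layout of the carry-less multiplication. Each detector output c_k may exceed 1, and weighting the outputs by powers of two recovers the product.

```python
def convolve(a_bits, b_bits):
    """Discrete convolution: output k sums every a_i * b_j with i + j = k."""
    out = [0] * (len(a_bits) + len(b_bits) - 1)
    for i, a_i in enumerate(a_bits):
        for j, b_j in enumerate(b_bits):
            out[i + j] += a_i * b_j
    return out


def to_bits_msb_first(value, width):
    """Binary digits of value, most significant bit first."""
    return [(value >> (width - 1 - k)) & 1 for k in range(width)]


def weighted_value(digits_msb_first):
    """Interpret digits (possibly greater than 1) with binary place weights."""
    total = 0
    for digit in digits_msb_first:
        total = total * 2 + digit
    return total


a = to_bits_msb_first(5, 3)   # [1, 0, 1]
b = to_bits_msb_first(7, 3)   # [1, 1, 1]
c = convolve(a, b)            # c_1 .. c_5 as summed on the detectors

print(c, weighted_value(c))
```

For the inputs 5 and 7 the detector sums are [1, 1, 2, 1, 1], whose weighted value is 35. The carries are deferred to the final weighting step, which is why the algorithm is described as carry-less.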


                          a_1        a_2        a_3
                    x     b_1        b_2        b_3
    ------------------------------------------------
                          a_1 b_3    a_2 b_3    a_3 b_3
               a_1 b_2    a_2 b_2    a_3 b_2
    a_1 b_1    a_2 b_1    a_3 b_1
    ------------------------------------------------
    c_1        c_2        c_3        c_4        c_5

Figure 9: Carry-less digital multiplication by analog convolution algorithm

6.0 Full Digital:

Now the question becomes: what happens if, instead of using the detectors as summing nodes as in the quasi-digital
case above, we use the detectors as Boolean summing devices, i.e. a thresholding device or an "OR" gate? Another way
of stating this question is: what digital primitives are represented by digital convolution with a digital threshold?

In Figure 10, the outputs are all placed onto a single detector. The detector is used as an "ORing" device which
produces either a one or zero to the output gate, which subsequently inverts the result.

Mathematically, the detector output can now be written as:

    O = a_0 b_0 + a_1 b_0 + a_2 b_0 + a_0 b_1 + a_1 b_1 + a_2 b_1 + a_0 b_2 + a_1 b_2 + a_2 b_2

Figure 10: Full global, full digital with single Boolean sum detector

This can be expressed, after algebraic grouping, as:

    O = b_0 (a_0 + a_1 + a_2) + b_1 (a_0 + a_1 + a_2) + b_2 (a_0 + a_1 + a_2)

This subsequently becomes:

    O = (b_0 + b_1 + b_2)(a_0 + a_1 + a_2)

The key to understanding the significance of this expression comes from applying DeMorgan's Law. DeMorgan's
Law states:

    \overline{x + y} = \bar{x} \bar{y}

Consequently, after application of DeMorgan's Law the output becomes:

    O = \overline{\bar{b}_0 \bar{b}_1 \bar{b}_2 + \bar{a}_0 \bar{a}_1 \bar{a}_2}

After output inversion by the output gate the final result can be written as:

    \overline{O} = \bar{b}_0 \bar{b}_1 \bar{b}_2 + \bar{a}_0 \bar{a}_1 \bar{a}_2

If the inputs are driven with the inversions of the bits instead of the bits themselves, then the output can be written:

    \overline{O} = b_0 b_1 b_2 + a_0 a_1 a_2
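
The derivation above can be confirmed by exhaustive enumeration over all three-bit input words. The sketch below models the single detector as a Boolean OR over every a_i b_j product and the output gate as an inverter; the function names are illustrative and are not taken from the report.

```python
from itertools import product


def detector_or(a_bits, b_bits):
    """Single thresholding detector: OR over every a_i AND b_j pair."""
    return any(a_i and b_j for a_i in a_bits for b_j in b_bits)


def global_primitive(a_bits, b_bits):
    """Drive both SLMs with inverted bits, then invert the detector output."""
    inverted_a = [not bit for bit in a_bits]
    inverted_b = [not bit for bit in b_bits]
    return not detector_or(inverted_a, inverted_b)


def verify(width=3):
    """Check both identities for every pair of input words; return case count."""
    cases = 0
    for bits in product([False, True], repeat=2 * width):
        a_bits, b_bits = bits[:width], bits[width:]
        # Grouping step: the nine-term OR equals (OR of a) AND (OR of b).
        assert detector_or(a_bits, b_bits) == (any(a_bits) and any(b_bits))
        # Final result: wide AND of a, OR wide AND of b.
        assert global_primitive(a_bits, b_bits) == (all(a_bits) or all(b_bits))
        cases += 1
    return cases


print(verify())
```

All 64 input combinations satisfy both the grouped form and the final AND-OR form, so the algebra holds for the three-bit case shown in Figure 10.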

Consequently, full global interconnect effectively produces the digital logic primitive of two N-bit wide AND gates
followed by the OR-invert operation, as shown in Figure 11.

If more than two SLMs are cascaded, the number of N-input AND gates feeding the OR gate grows as the number
of SLMs. As can be seen from Figure 11, the global interconnect primitive is similar to the parallel interconnect primitive
as described in reference 4, with the difference that the global interconnect primitive is far more powerful. The parallel
interconnect primitive is essentially an array of 2-input AND gates followed by a multiple-input OR gate. Here we have
multiple-input ANDs feeding a multiple-input OR. In effect, we have graduated from the arbitrary selection of minterm
functionals to the arbitrary selection of the sum of minterm functionals.



Figure 11: Full Global digital optical primitive for 2 level SLM cascade

7.0 Conclusion:

Digital optical computing is becoming a very tough competitor to semiconductor technology, since it can support
a very high degree of three-dimensional interconnect density and high degrees of fan-in without capacitive loading
effects, at very low power consumption levels.